Accelerating deep neural network training with inconsistent stochastic gradient descent
Abstract
Stochastic Gradient Descent (SGD) updates a Convolutional Neural Network (CNN) with a noisy gradient computed from a random batch, and each batch evenly updates the network once per epoch. This uniform scheme applies the same training effort to every batch, but it overlooks the fact that the gradient variance, induced by Sampling Bias and Intrinsic Image Difference, renders different training dynamics on...
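The scheduling idea can be illustrated with a short sketch: batches whose loss stays well above a running average are treated as under-trained and receive extra updates, so training effort follows the observed training dynamics instead of being spread evenly. This is a toy rendering on a least-squares model, not the paper's implementation; the model, the 1.5x threshold, and the update cap are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(512, 10))
true_w = rng.normal(size=10)
y = X @ true_w + 0.1 * rng.normal(size=512)

def loss_and_grad(w, xb, yb):
    """Least-squares loss and gradient on one batch."""
    r = xb @ w - yb
    return 0.5 * np.mean(r ** 2), xb.T @ r / len(yb)

batches = [(X[i:i + 64], y[i:i + 64]) for i in range(0, 512, 64)]
w, lr, avg_loss = np.zeros(10), 0.1, None

for epoch in range(20):
    for xb, yb in batches:
        loss, g = loss_and_grad(w, xb, yb)
        w -= lr * g
        # Exponential moving average of batch losses.
        avg_loss = loss if avg_loss is None else 0.9 * avg_loss + 0.1 * loss
        # "Inconsistent" part: a batch whose loss sits well above the
        # average is considered under-trained and gets a few extra
        # updates (threshold and cap are arbitrary choices here).
        extra = 0
        while loss > 1.5 * avg_loss and extra < 3:
            loss, g = loss_and_grad(w, xb, yb)
            w -= lr * g
            extra += 1
```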
Similar resources

Recurrent neural network training with preconditioned stochastic gradient descent
Recurrent neural networks (RNNs), especially those requiring extremely long-term memories, are difficult to train. Hence, they provide an ideal testbed for benchmarking the performance of optimization algorithms. This paper reports test results of a recently proposed preconditioned stochastic gradient descent (PSGD) algorithm on RNN training. We find that PSGD may outperform Hessian-free o...
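PSGD preconditions the stochastic gradient before applying the update. As a hedged illustration only, the sketch below uses a simple diagonal preconditioner built from a running second-moment estimate; the actual PSGD algorithm fits a richer preconditioner online, and the function name and hyperparameters here are assumptions.

```python
import numpy as np

def psgd_step(w, grad, v, lr=0.01, beta=0.99, eps=1e-8):
    """One SGD step with a diagonal preconditioner.

    v tracks a running estimate of the squared gradient; dividing by
    its square root rescales each coordinate before the update.
    """
    v = beta * v + (1 - beta) * grad ** 2
    w = w - lr * grad / (np.sqrt(v) + eps)
    return w, v

# Usage: carry (w, v) across iterations, feeding in each new gradient.
w, v = np.zeros(10), np.zeros(10)
w, v = psgd_step(w, np.ones(10), v)
```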
Accelerating Stochastic Gradient Descent
There is widespread sentiment that fast gradient methods (e.g. Nesterov’s acceleration, conjugate gradient, heavy ball) are not effective for the purposes of stochastic optimization due to their instability and error accumulation. Numerous works have attempted to quantify these instabilities in the face of either statistical or non-statistical errors (Paige, 1971; Proakis, 1974; Polyak, 1987; G...
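For reference, one of the fast gradient methods named above, Nesterov's acceleration, applied to a stochastic gradient looks as follows. This is a minimal sketch; `grad_fn` and the hyperparameter values are illustrative assumptions.

```python
import numpy as np

def nesterov_step(w, velocity, grad_fn, lr=0.01, momentum=0.9):
    """One Nesterov-accelerated step: evaluate the (stochastic)
    gradient at a look-ahead point along the momentum direction."""
    lookahead = w + momentum * velocity
    velocity = momentum * velocity - lr * grad_fn(lookahead)
    return w + velocity, velocity

# Example on f(w) = 0.5 * ||w||^2, whose gradient is w itself.
w, v = np.ones(5), np.zeros(5)
for _ in range(100):
    w, v = nesterov_step(w, v, grad_fn=lambda x: x)
```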
SpeeDO: Parallelizing Stochastic Gradient Descent for Deep Convolutional Neural Network
Convolutional Neural Networks (CNNs) have achieved breakthrough results on many machine learning tasks. However, training CNNs is computationally intensive. When the size of training data is large and the depth of CNNs is high, as typically required for attaining high classification accuracy, training a model can take days and even weeks. In this work, we propose SpeeDO (for Open DEEP learning ...
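The core pattern such systems parallelize is synchronous data-parallel SGD: each worker computes a gradient on its own data shard and a central update averages them. The sketch below simulates this in a single process; it is not SpeeDO's API, and `grad_fn` and the shard layout are assumptions.

```python
import numpy as np

def parallel_sgd_step(w, shards, grad_fn, lr=0.01):
    """One synchronous step: each (xb, yb) shard stands in for a
    worker; the averaged gradient stands in for the server update."""
    grads = [grad_fn(w, xb, yb) for xb, yb in shards]
    return w - lr * np.mean(grads, axis=0)

# Usage with a least-squares gradient split over four "workers".
rng = np.random.default_rng(1)
X, y = rng.normal(size=(256, 10)), rng.normal(size=256)
shards = [(X[i::4], y[i::4]) for i in range(4)]
lsq_grad = lambda w, xb, yb: xb.T @ (xb @ w - yb) / len(yb)
w = parallel_sgd_step(np.zeros(10), shards, lsq_grad)
```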
Natural Gradient Descent for Training Stochastic Complex-Valued Neural Networks
In this paper, the natural gradient descent method for multilayer stochastic complex-valued neural networks is considered, and the natural gradient is given for a single stochastic complex-valued neuron as an example. Since the space of learnable parameters of stochastic complex-valued neural networks is not a Euclidean space but a curved manifold, the complex-valued natural gradient ...
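In general, a natural gradient step preconditions the ordinary gradient with the inverse Fisher information matrix, so the update follows the steepest descent direction on the parameter manifold rather than in raw coordinates. The real-valued sketch below estimates an empirical Fisher from per-sample gradients; the complex-valued construction in the paper is analogous but not reproduced here, and the damping value and names are assumptions.

```python
import numpy as np

def natural_gradient_step(w, per_sample_grads, lr=0.1, damping=1e-3):
    """One natural-gradient step: precondition the mean gradient with
    the inverse of an empirical Fisher matrix built from per-sample
    gradients. Damping keeps the matrix invertible."""
    g = per_sample_grads.mean(axis=0)
    fisher = per_sample_grads.T @ per_sample_grads / len(per_sample_grads)
    fisher += damping * np.eye(len(w))
    return w - lr * np.linalg.solve(fisher, g)

# Usage: rows of G are gradients of per-sample losses at the current w.
w = np.zeros(3)
G = np.random.default_rng(2).normal(size=(100, 3))
w = natural_gradient_step(w, G)
```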
Journal
Journal title: Neural Networks
Year: 2017
ISSN: 0893-6080
DOI: 10.1016/j.neunet.2017.06.003